Memory Properties Of Transformations Of Linear Processes And Symmetric Gini Correlation
A large class of time series processes can be modeled by linear processes, including a subset of the fractional ARIMA process. Transformation of linear processes is one of the most popular topics in univariate time-series analysis in recent years. In this dissertation, we study the memory properties of transformations of linear processes. Our results show that transformations of short-memory time series still have short memory, while transformations of long-memory time series may have different, weaker memory parameters that depend on the power rank of the transformation. In particular, we provide the memory parameters of the FARIMA(p,d,q) processes. As an example, the memory properties of call option processes at different strike prices are discussed in detail. In developing the memory properties of transformations of linear processes, we use the Pearson correlation to measure the memory. Correlation is another big topic in statistics, used to measure the dependence of stochastic processes or random variables. Standard Gini correlation is one of the correlations used to measure the dependence between random variables with heavy-tailed distributions. However, the asymmetry of Gini covariance and correlation brings a substantial difficulty in interpretation. In this dissertation, we propose a symmetric Gini-type covariance and correlation (ρg) based on the joint rank function. The proposed correlation ρg is symmetric and is more robust than the Pearson correlation but less robust than Kendall's τ correlation in terms of influence functions. Furthermore, we establish the relationship between ρg and the linear correlation ρ for a class of random vectors in the family of elliptical distributions, which allows us to estimate ρ based on estimation of ρg.
We compare asymptotic efficiencies of linear correlation estimators based on the symmetric Gini, and the proposed measure ρg shows superior finite sample performance, which makes it attractive in applications.
Grouped feature screening for ultrahigh-dimensional classification via Gini distance correlation
Gini distance correlation (GDC) was recently proposed to measure the
dependence between a categorical variable, Y, and a numerical random vector, X.
It mutually characterizes independence between X and Y. In this article, we
utilize the GDC to establish a feature screening for ultrahigh-dimensional
discriminant analysis where the response variable is categorical. It can be
used for screening individual features as well as grouped features. The
proposed procedure possesses several appealing properties. It is model-free. No
model specification is needed. It holds the sure independence screening
property and the ranking consistency property. The proposed screening method
can also deal with the case in which the response has a diverging number of
categories. We conduct several Monte Carlo simulation studies to examine the
finite sample performance of the proposed screening procedure. Real data
analyses of two real-life datasets are presented. Comment: 25 pages, 1 figure
Two symmetric and computationally efficient Gini correlations
© 2020 Courtney Vanderford et al., published by De Gruyter. Standard Gini correlation plays an important role in measuring the dependence between random variables with heavy-tailed distributions. It is based on the covariance between one variable and the rank of the other. Hence, for each pair of random variables, there are two Gini correlations, and they are not equal in general, which brings a substantial difficulty in interpretation. Recently, Sang et al. (2016) proposed a symmetric Gini correlation based on the joint spatial rank function, with a computational cost of O(n^2), where n is the sample size. In this paper, we study two symmetric and computationally efficient Gini correlations with a computational complexity of O(n log n). The properties of the new symmetric Gini correlations are explored. The influence function approach is used to study the robustness and the asymptotic behavior of these correlations. Asymptotic relative efficiencies are used to compare several popular correlations under symmetric distributions with different tail-heaviness as well as an asymmetric log-normal distribution. Simulations and a real data application are conducted to demonstrate the desirable performance of the two new symmetric Gini correlations.
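The asymmetry described in this abstract can be seen in a small numerical sketch. The code below is an illustration only, not the authors' implementation: it assumes the usual sample form of the standard Gini correlation, cov(X, rank(Y)) / cov(X, rank(X)), and the helper names `ranks` and `gini_correlation` are hypothetical. Ties are not handled.

```python
import numpy as np

def ranks(v):
    """Ordinal ranks 1..n (assumes no ties in v)."""
    v = np.asarray(v, dtype=float)
    order = np.argsort(v)
    r = np.empty(len(v), dtype=float)
    r[order] = np.arange(1, len(v) + 1)
    return r

def gini_correlation(x, y):
    """Sample standard Gini correlation of x with respect to y.

    Covariance between x and the rank of y, normalized by the
    covariance between x and its own rank. The rank scaling (1/n)
    cancels in the ratio, so raw ranks are used.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    num = np.cov(x, ranks(y))[0, 1]
    den = np.cov(x, ranks(x))[0, 1]
    return num / den

x = [1.0, 2.0, 3.0, 10.0]
y = [1.0, 3.0, 2.0, 4.0]
g_xy = gini_correlation(x, y)  # 13/14, about 0.929
g_yx = gini_correlation(y, x)  # 4/5 = 0.8
print(g_xy, g_yx)
```

On this toy sample the two directions give 13/14 and 4/5, which is the interpretive difficulty that motivates a symmetric version. The sorting step in `ranks` costs O(n log n).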
Asymptotic Normality of Gini Correlation in High Dimension with Applications to the K-sample Problem
The categorical Gini correlation proposed by Dang et al. is a dependence
measure between a categorical variable and a numerical variable, which can
characterize independence of the two variables. The asymptotic distributions
of the sample correlation under dependence and independence have been established when
the dimension of the numerical variable is fixed. However, its asymptotic
distribution for high dimensional data has not been explored. In this paper, we
develop the central limit theorem for the Gini correlation for the more
realistic setting where the dimensionality of the numerical variable is
diverging. We then construct a powerful and consistent test for the K-sample
problem based on the asymptotic normality. The proposed test not only avoids
the computational burden of the permutation procedure but also gains power over it.
Simulation studies and real data illustrations show that the proposed test is
more competitive than existing methods across a broad range of realistic
situations, especially in unbalanced cases. Comment: 31 pages, 3 figures
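For readers unfamiliar with the categorical Gini correlation, the following sketch may help. It is not taken from the abstract: it assumes the commonly stated definition (Δ − Σ_k p_k Δ_k) / Δ, where Δ is the mean Euclidean distance between two independent observations of X and Δ_k is the same quantity within category k. The function names are hypothetical, and the pairwise computation is a plain O(n^2) illustration, not the test proposed in the paper.

```python
import numpy as np

def mean_pairwise_distance(x):
    """Mean Euclidean distance over all distinct pairs of rows of x."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]  # treat a 1-D input as n observations of dimension 1
    n = len(x)
    if n < 2:
        return 0.0
    diffs = x[:, None, :] - x[None, :, :]
    dist = np.sqrt((diffs ** 2).sum(axis=2))
    return dist.sum() / (n * (n - 1))

def categorical_gini_correlation(x, labels):
    """Sample version of (Delta - sum_k p_k * Delta_k) / Delta (assumed definition)."""
    x = np.asarray(x, dtype=float)
    labels = np.asarray(labels)
    overall = mean_pairwise_distance(x)
    within = sum(
        (labels == k).mean() * mean_pairwise_distance(x[labels == k])
        for k in np.unique(labels)
    )
    return (overall - within) / overall

x = [0.0, 1.0, 10.0, 11.0]
labels = ["a", "a", "b", "b"]
print(categorical_gini_correlation(x, labels))  # 6/7, about 0.857
```

In this toy sample the two categories are well separated, so most of the overall spread is between categories and the value is close to 1.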